Abstract

As the first step in an investigation of the origin of genetic infor- 
mation, we study how some species of molecules are preserved over 
cell generations and play an important role in controlling the growth 
of a cell. We consider a model consisting of protocells. Each protocell 
contains two mutually catalyzing molecule species (X and Y), each of
which has catalytically active and inactive types. One of the species Y 
is assumed to have a slower synthesis speed. Through divisions of the 
protocells, the system reaches and remains in a state in which there 
are only a few active Y and almost no inactive Y molecules in most 
protocells, through selection of very rare fluctuations. In this state, 
the active Y molecules are shown to control the behavior of the protocell.
The minority molecule species acts as the information carrier, due
to the relatively discrete nature of its population, in comparison with 
the majority species which behaves statistically in accordance with 
the law of large numbers. The relevance of this minority controlled 
state to evolvability is discussed. 

Key words: chicken and egg problem, genetic information, minority control, 
evolvability 
1 Introduction



The origin of genetic information in a replicating system is an important 
theoretical topic that should be studied, not necessarily as a property of 
certain molecules, but as a general property of replicating systems. Consider 
a simple prototype cell that consists of mutually catalyzing molecule species 
whose intra-cellular population growth results in cell reproduction. In this 
protocell, the molecules that carry the genetic information are not initially 
specified, and to realize the growth in molecule numbers alone, it may not 
be necessary for specific molecules carrying such information to exist. 

In actual cells, however, it is generally believed that information is en- 
coded in DNA, which controls the behavior of a cell. With regard to this 
point, though it is not necessary to take a strong 'geno-centric' standpoint, 
it cannot be denied that there exists a difference between DNA and protein 
molecules in the role of information carrier. Still, even in actual cells, pro- 
teins and DNA both possess catalytic ability, and catalyze the production of 
each other, leading to cell replication.^ Then, why is DNA regarded as the 
carrier of information? 

To investigate this problem we need to clarify what it means for something 
to be an information carrier. For information carrying molecules, we identify 
the following two features as necessary. 

(1) If this molecule is removed or replaced by a mutant, there is a strong 
influence on the behavior of the cell. We refer to this as the 'control property'. 

(2) Such molecules are preserved well over generations. The number of 
such molecules exhibits smaller fluctuations than that of other molecules, 
and their chemical structure (such as polymer sequence) is preserved over 
a long time span, even under potential changes by fluctuations through the 
synthesis of these molecules. We refer to this as the 'preservation property'. 

According to the present understanding [Alberts et al. 1997], changes
undergone by DNA molecules are believed to exercise stronger influences on 
the behavior of cells than other chemicals. With a higher catalytic activity, a 
DNA molecule has a stronger influence on the behavior of a cell. Also, a DNA 
molecule is transferred to offspring cells relatively accurately, compared with 
other constituents of the cell. Hence a DNA molecule satisfies the properties
(1) and (2).

^ Note that through the reaction process from DNA to messenger RNA, and then to
the synthesis of proteins, a DNA molecule itself is not changed. As a result, DNA works
as a catalyst for the synthesis of proteins.

In addition, a DNA molecule is stable, and the time scale for the change of 
DNA, e.g., its replication process as well as its destruction process, is slower. 
Because of this relatively slow replication, the number of DNA molecules is 
smaller than the number of protein molecules. In one cell generation, each
DNA molecule typically replicates once, while other molecules
undergo more replications (and decompositions).

The question we address in the present paper is as follows. Consider 
a protocell with mutually catalyzing molecules. Under what conditions
does one molecule species begin to carry information in the sense of
(1) and (2)? We show, under rather general conditions in our model of a mutually
catalyzing system, that a symmetry breaking between the two kinds
of molecules takes place, and through replication and selection, one kind of 
molecule comes to satisfy the conditions (1) and (2). 

Without assuming the detailed biochemical properties of DNA, we seek a 
general condition for the differentiation of the roles of molecules in a cell and 
study the origin of the controlling behavior of some molecules. Assuming 
only a difference in the synthesis speeds of the two kinds of molecules, we 
show that the species that eventually possesses a smaller population satisfies 
(1) and (2) and acts as an information carrier. With this approach, we discuss 
the origin of information from a kinetic viewpoint. Note that we consider this 
information problem at a minimal level, i.e., as the origin of 1-bit information 
in a replicating cell system. 

In the present paper we consider a very simple protocell system (see
Fig. 1), consisting of two species of replicating molecules that catalyze each
other. Each species has active and inactive molecule types, with only the
active types of one species catalyzing the replication of the other species.
The rate of replication is different for the two species. We
consider the behavior of a system with such mutually catalyzing molecules 
with different replication speeds, as a first step in answering the question 
posed above. We show that the molecule species with slower replication 
speed comes to possess the properties (1) and (2), and that it therefore comes 
to represent the information carrier for cell replication. Finally, we discuss 
why a system with a separation of roles between information carrying and 
metabolism has a higher evolvability (i.e., ability to evolve), in reference to 
a recent experiment on an artificial replication system. 
2 Toy Model



To study the general features of a system with mutually catalyzing molecules, 
we consider the following minimal model. First, we envision a (proto)cell 
containing molecules. With a supply of chemicals available to the cell, these 
molecules replicate through catalytic reactions, so that their numbers within 
a cell increase. When the total number of molecules exceeds a given thresh- 
old, the cell divides into two, with each daughter cell inheriting half of the 
molecules of the mother, chosen randomly. Regarding the chemical species 
and the reaction, we make the following simplifying assumptions: 

(i) There are two species of molecules, X and Y, which are mutually 
catalyzing. 

(ii) For each species, there are active (A) and inactive (I) types. There
are thus four types, $X^A$, $X^I$, $Y^A$, and $Y^I$. The active type has the ability to
catalyze the replication of both types of the other species of molecules. The
catalytic reactions for replication are assumed to take the form^

$X^J + Y^A \to 2X^J + Y^A$ (for J = A or I)

and

$Y^J + X^A \to 2Y^J + X^A$ (for J = A or I).

(iii) The rates of synthesis (or catalytic activity) of the molecules X and 
Y differ. We stipulate that the rate of the above replication process for Y,
$\gamma_y$, is much smaller than that for X, $\gamma_x$. This difference in the rates may also
be caused by a difference in catalytic activities between the two molecule 
species. 

(iv) It is natural to assume that the active molecule type is rather rare. 
With this in mind, we assume that there are F types of inactive molecules 
per active type. For most simulations, we consider the case in which there is 
only one type of active molecules for each species. 

(v) In the replication process, there may occur structural changes that 
alter the activity of molecules. Therefore the type (active or inactive) of 
a daughter molecule can differ from that of the mother. The rate of such 
structural change is given by $\mu$, which is not necessarily small, due to thermodynamic
fluctuations. This change can consist of the alteration of a
sequence in a polymer or other conformational change, and may be regarded
as replication 'error'. Note that the probability for the loss of activity is F
times greater than for its gain, since there are F times more types of inactive
molecules than active molecules. Hence, there are processes described by

$X^I \to X^A$ and $Y^I \to Y^A$ (with rate $\mu$),

$X^A \to X^I$ and $Y^A \to Y^I$ (with rate $\mu F$),

resulting from structural change.

^ More precisely, there is a supply of precursor molecules for the synthesis of X and Y,
and the replication occurs with catalytic influence of either $X^A$ or $Y^A$.

(vi) When the total number of molecules in a protocell exceeds a given
value 2N, it divides into two, and the chemicals therein are distributed into
the two daughter cells randomly, with N molecules going to each. Subsequently,
the total number of molecules in each daughter cell increases from
N to 2N, at which point these divide.

(vii) To include competition, we assume that there is a constant total 
number $M_{tot}$ of protocells, so that one protocell, randomly chosen, is removed
whenever a (different) protocell divides into two. 

With the above described process, we have basically four sets of parameters:
the ratio of synthesis rates $\gamma_y/\gamma_x$, the error rate $\mu$, the fraction of active
molecules 1/F, and the number of molecules N. (The number $M_{tot}$ is not
important, as long as it is not too small.)

We carried out simulations of this model according to the following procedure.
First, a pair of molecules is chosen randomly. If these molecules
are of different species, then if the X molecule is active, a new Y molecule
is produced with the probability $\gamma_y$, and if the Y molecule is active, a new
X molecule is produced with the probability $\gamma_x$. Such replications occur
with the error rates given above. All the simulations were thus carried out
stochastically, in this manner.
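
The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the function names, the initial composition (all molecules active), the uniform choice of which cell reacts at each step, and the small default parameter values are all hypothetical, and a cell is divided as soon as its total reaches 2N.

```python
import random

TYPES = ("XA", "XI", "YA", "YI")  # active/inactive types of species X and Y

def pick_pair(cell, rng):
    """Draw two distinct molecules uniformly at random and return their types."""
    first = rng.choices(TYPES, weights=[cell[t] for t in TYPES])[0]
    cell[first] -= 1  # remove temporarily so the same molecule is not drawn twice
    second = rng.choices(TYPES, weights=[cell[t] for t in TYPES])[0]
    cell[first] += 1
    return first, second

def daughter_type(template, mu, F, rng):
    """Copy the template type; activity is lost at rate mu*F and gained at rate mu."""
    species, kind = template[0], template[1]
    if kind == "A" and rng.random() < mu * F:
        kind = "I"
    elif kind == "I" and rng.random() < mu:
        kind = "A"
    return species + kind

def reaction_step(cell, gamma_x, gamma_y, mu, F, rng):
    """One reaction attempt inside a single cell."""
    a, b = pick_pair(cell, rng)
    if a[0] == b[0]:
        return  # same species: no reaction
    x, y = (a, b) if a[0] == "X" else (b, a)
    new = []
    if x == "XA" and rng.random() < gamma_y:  # active X catalyzes a copy of the Y
        new.append(daughter_type(y, mu, F, rng))
    if y == "YA" and rng.random() < gamma_x:  # active Y catalyzes a copy of the X
        new.append(daughter_type(x, mu, F, rng))
    for t in new:
        cell[t] += 1

def divide(cell, rng):
    """Split the molecules of a cell at random into two halves."""
    pool = [t for t in TYPES for _ in range(cell[t])]
    rng.shuffle(pool)
    half = len(pool) // 2
    daughters = ({t: 0 for t in TYPES}, {t: 0 for t in TYPES})
    for idx, t in enumerate(pool):
        daughters[0 if idx < half else 1][t] += 1
    return daughters

def simulate(m_tot=20, N=50, gamma_x=1.0, gamma_y=0.1, mu=0.01, F=4,
             steps=20000, seed=0):
    """Return the composition of every cell recorded just before its division."""
    rng = random.Random(seed)
    cells = [{"XA": N // 2, "XI": 0, "YA": N - N // 2, "YI": 0}
             for _ in range(m_tot)]
    records = []
    for _ in range(steps):
        i = rng.randrange(m_tot)
        reaction_step(cells[i], gamma_x, gamma_y, mu, F, rng)
        if sum(cells[i].values()) >= 2 * N:
            records.append(dict(cells[i]))
            d1, d2 = divide(cells[i], rng)
            # the second daughter replaces a randomly chosen *different* cell
            j = rng.choice([c for c in range(m_tot) if c != i])
            cells[i], cells[j] = d1, d2
    return records
```

Averaging the recorded compositions over all divisions gives the kind of per-division averages discussed in the next section; the parameter values here are far smaller than those needed to reproduce the reported figures.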

We consider a stochastic model rather than the corresponding rate equation,
which is valid for large N, since we are interested in the case with
relatively small N. This follows from the fact that in a cell, often the number
of molecules of a given species is not large, and thus the continuum limit
implied in the rate equation approach is not necessarily justified [Hess and 
Mikhailov 1994, 1995; Stange, Mikhailov, and Hess 1998]. Furthermore, it 
has recently been found that the discrete nature of a molecule population 
leads to qualitatively different behavior than in the continuum case in a sim- 
ple autocatalytic reaction network [Togashi and Kaneko 2001]. 
3 Result



If N is very large, the above described stochastic model can be replaced by
a continuous model given by the rate equation. Then the growth dynamics
of the numbers of molecules $N_x^J$ and $N_y^J$ (for J = A or I) is described by the
rate equations

$dN_x^J/dt = \gamma_x N_x^J N_y^A$; $dN_y^J/dt = \gamma_y N_y^J N_x^A$. (1)

From these equations, under repeated divisions, it is expected that the relations
$N_x^A/N_y^A = \gamma_x/\gamma_y$, $N_x^A/N_x^I = 1/F$, and $N_y^A/N_y^I = 1/F$ are eventually satisfied. Indeed, even
with our stochastic simulation, this number distribution is approached as N
is increased.
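
To get a feel for the numbers implied by the continuum description, the following Python sketch evaluates the expected composition of a cell at division. It assumes that the continuum relations amount to (a) a ratio of X to Y molecules equal to the ratio of synthesis rates and (b) an active-to-inactive ratio of 1:F within each species; the function name and argument names are hypothetical.

```python
def continuum_composition(total, gamma_ratio, F):
    """Expected molecule numbers under the assumed continuum relations.

    total       -- total number of molecules in the cell (2N at division)
    gamma_ratio -- gamma_y / gamma_x, the ratio of synthesis rates
    F           -- number of inactive types per active type
    """
    n_y = total * gamma_ratio / (1.0 + gamma_ratio)  # from N_x / N_y = gamma_x / gamma_y
    n_x = total - n_y
    active = 1.0 / (1.0 + F)                         # from active : inactive = 1 : F
    return {
        "XA": n_x * active,
        "XI": n_x * (1.0 - active),
        "YA": n_y * active,
        "YI": n_y * (1.0 - active),
    }

# Illustrative values: 2N = 2000, gamma_y/gamma_x = 0.01, F = 64.
comp = continuum_composition(total=2000, gamma_ratio=0.01, F=64)
print(comp)
```

With these illustrative values the expected number of active Y molecules is below one, which is the regime where the discreteness of the molecule number matters.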

However, when N is small, and with the selection process, there appears
a significant deviation from the above distribution. In Fig. 2, we have plotted
the average numbers $\langle N_x^A\rangle$, $\langle N_x^I\rangle$, $\langle N_y^A\rangle$, and $\langle N_y^I\rangle$. Here, $\langle\ldots\rangle$ represents the
average over time of the number of the molecules of the individual species,
existing in a cell just prior to the division, when the total number of molecules
is 2N, averaged over all observed divisions throughout the system. (Accordingly,
a cell removed without division does not contribute to the average.)
As shown in the figure, there appears a state satisfying $\langle N_y^A\rangle \approx 2$–$10$,
$\langle N_y^I\rangle \approx 0$. Since $F \gg 1$, such a state with $N_y^A/N_y^I > 1$ is not expected from
the rate equation (1). Indeed, for the X species, the number of inactive
molecules is much larger than the number of active ones. Hence, we have
found a novel state that can be realized due to the smallness of the number
of molecules and the selection process.

In Fig. 2, $\gamma_y/\gamma_x$ and F are fixed to 0.01 and 64, respectively, while the
dependence of $\{\langle N_x^A\rangle, \langle N_x^I\rangle, \langle N_y^A\rangle, \langle N_y^I\rangle\}$ on these parameters is plotted in
Figs. 3 and 4. As shown in these figures, the above mentioned state with
$\langle N_y^A\rangle \approx 2$–$10$, $\langle N_y^I\rangle < 1$ is reached and sustained when $\gamma_y/\gamma_x$ is small and
F is sufficiently large. In fact, for most dividing cells, $N_y^I$ is exactly 0, while
there appear a few cells with $N_y^I > 1$ from time to time. It should be noted
that the state with almost no inactive Y molecules appears in the case of
larger F, i.e., in the case of a larger possible variety of inactive molecules.
This suppression of $N_y^I$ for large F contrasts with the behavior found in the
continuum limit (the rate equation). In Fig. 4, the proportion of active Y
molecules is plotted as a function of F. Up to some value of F, this proportion
decreases, in agreement with the naive expectation provided by Eq. (1), but
it increases with further increase of F, in the case that $\gamma_y/\gamma_x$
is small (~ 0.02) and N is small.

This behavior of the molecular populations can be understood from the
viewpoint of selection. In a system with mutual catalysis, both $X^A$ and $Y^A$
are necessary for the replication of protocells to continue. The number of
Y molecules is expected to be rather small, since their synthesis speed is
much slower than that of X molecules. Indeed, the fixed point distribution
given by the continuum limit equations possesses a rather small $N_y^A$. In fact,
when the total number of molecules is sufficiently small, the value of $\langle N_y^A\rangle$
given by these equations is less than 1. However, in a system with mutual
catalysis, both $X^A$ and $Y^A$ must be present for replication of protocells to
continue. In particular, for the replication of X molecules to continue, at
least a single active Y molecule is necessary. Hence, if $N_y^A$ vanishes, only
the replication of inactive Y molecules occurs. For this reason, divisions
producing descendants of this cell cannot proceed indefinitely, because the
number of $X^A$ molecules is cut in half at each division. Thus, a cell with
$N_y^A < 1$ cannot leave a continuing line of descendant cells. Also, for a cell
with $N_y^A = 1$, only one of its daughter cells can have an active Y molecule.
Hence a cell with $N_y^A = 1$ has no potential to multiply through division,
and for this reason, given the presence of cells with $N_y^A > 1$ and selection,
the number of cells with $N_y^A = 1$ should decrease with time. We thus see
that over a sufficiently long time, protocells with $N_y^A > 1$ are selected.
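
The inheritance argument can be made quantitative with a short Python sketch. It assumes, as an idealization, that a cell holding exactly 2N molecules hands N of them to each daughter uniformly at random, so that the number of active Y molecules received by one daughter is hypergeometric; the function name is hypothetical.

```python
from math import comb

def prob_both_daughters_inherit(k, N):
    """Probability that both daughters receive at least one active Y molecule.

    k -- number of active Y molecules in the mother (requires k <= N)
    N -- number of molecules going to each daughter (mother holds 2N)
    """
    ways = comb(2 * N, N)                         # all ways to pick the first daughter
    p_first_gets_none = comb(2 * N - k, N) / ways
    p_first_gets_all = comb(2 * N - k, N - k) / ways
    return 1.0 - p_first_gets_none - p_first_gets_all

for k in (1, 2, 3, 5, 10):
    print(k, prob_both_daughters_inherit(k, N=1000))
```

With a single active Y molecule the probability is zero, so the number of viable cells cannot grow; with two it is already close to one half, and it approaches one quickly as the number increases.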

The total number of Y molecules is limited to small values, due to their
slow synthesis speed. This implies that a cell that suppresses the number of
$Y^I$ molecules to be as small as possible is preferable under selection, so that
there is room for $Y^A$ molecules. Hence, a state with almost no $Y^I$ molecules
and a few $Y^A$ molecules, once realized through fluctuations, is expected to
be selected through competition for survival.

Of course, the fluctuations necessary to produce such a state decrease 
quite rapidly as the total molecule number increases, and for sufficiently large 
numbers, the continuum description of the rate equation is valid. Clearly 
then, a state of the type described above is selected only when the total 
number of molecules within a protocell is not too large. In fact, a state
with very small $N_y^I$ appears only if the total number N is smaller than some
threshold value depending on F and $\gamma_y$.

To summarize our result, we have found that a state with a few active Y 
molecules and a very small number of inactive Y molecules is selected if the
replication of Y molecules is much slower than that of X, a large variety of 
inactive molecules exists, and the total number of molecules is sufficiently 
small. 

Remark: In the model considered here, we have included a mechanism 
for the synthesis of molecules, but not for their decomposition. To inves- 
tigate the effect of the decomposition of molecules, we have also studied a 
model including a process to remove molecules randomly at some rate. We 
found that the above stated conclusion is not altered by the inclusion of this 
mechanism. 

4 Minority Controlled State 

In §3, we showed that in a mutually catalyzing replication system, the se- 
lected state is one in which the number of inactive molecules of the slower 
replicating species, Y, is drastically suppressed. In this section, we first show 
that the fluctuations of the number of active Y molecules are smaller than
those of active X molecules in this state. Next, we show that the molecule 
species Y (the minority species) becomes dominant in determining the growth 
speed of the (proto)cell system. Then, considering a model with several ac- 
tive molecule types, the control of chemical composition through specificity 
symmetry breaking is demonstrated. 

4.1 Control of the growth speed 

First, we computed the time evolution of the number of active X and Y 
molecules, to see if the selection process acts more strongly to control the 
number of one or the other. We computed $N_x^A$ and $N_y^A$ at every division
to obtain the histograms of cells with given numbers of active molecules.
(Here, the values of $N_x^A$ were coarse-grained into bins of size 10, chosen
as [0,10], [10,20], ..., while all possible values of $N_y^A$, 1, 2, ..., were computed
separately.) The histograms for $N_x^A$ and $N_y^A$ were computed independently.

The histograms are plotted in Fig. 6a. We see that the distribution for
$N_y^A$ has a sharp peak near $N_y^A = 2$, while that for $N_x^A$ is much wider. Since
the root mean square of a distribution increases with the square root of
the average for a standard random process, we have plotted the histograms
by rescaling the ordinate by the expected average of $\sqrt{N_x^A/N_y^A}$, which is
approximately 10. Even after this rescaling, we find that the distribution
of $N_x^A$ is much wider than that of $N_y^A$. Hence, the fluctuations in the value
of $N_y^A$ are much smaller than those of $N_x^A$. We conclude that the selection
process discriminates more strongly between different concentrations of active 
Y molecules than between those of active X molecules. Hence, it is expected 
that the growth speed of our protocell has a stronger dependence on the 
number of active Y molecules than the number of active X molecules. 

To confirm such a dependence, we have computed the dependence of the
growth speed of a protocell on the numbers of molecules $N_x^A$ and $N_y^A$. Here,
we computed $T_d$, the time required for division, as a function of the number
of active X and Y molecules at each division. At every division we record
$N_x^A$ and $N_y^A$ to determine the time required for division $T_d$. The division
speed for a given $N_x^A$ is computed as the division time for this value of $N_x^A$
(with the bin size used for the histogram), averaged over all values of $N_y^A$,
and similarly for a given $N_y^A$. In Fig. 6b, these average division times are
plotted as functions of $N_x^A$ and $N_y^A$.

As shown in the figure, the division time is a much more rapidly decreasing
function of $N_y^A$ than of $N_x^A$. We see that even a slight change in the number
of active Y molecules has a strong influence on the division time of the
cell. Of course, the growth rate also depends on $N_x^A$, but this dependence is
much weaker. Hence, the growth speed is controlled mainly by the active Y
molecules.

In addition, the fluctuations around this average division time are smaller
for fixed $N_y^A$. To show this, we have computed the variance for $T_d(N_x^A)$
and $T_d(N_y^A)$. Considering that the variance typically increases in proportion
to the corresponding average, we rescaled each variance by dividing by the
corresponding average $T_d(N_x^A)$ or $T_d(N_y^A)$. This scaled variance takes values
around 0.55 for $T_d(N_x^A)$, and around 0.25 for $T_d(N_y^A)$. We thus conclude
that the fluctuations of $T_d(N_y^A)$ for fixed $N_y^A$ are smaller. This implies that
if $N_y^A$ is fixed, fluctuations of the division speed due to changes in $N_x^A$ are
much smaller than the other way around. In other words, the growth speed
is controlled mainly by the number of active Y molecules.
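
The conditional averages and scaled variances described in this subsection can be computed as in the following Python sketch. The record layout (number of active X, number of active Y, division time) and the function name are assumptions made for illustration, and the four records in the usage lines are made-up numbers, not simulation output.

```python
from collections import defaultdict
from statistics import mean, pvariance

def division_time_stats(records, key):
    """Group division times and return {group: (mean, variance / mean)}.

    records -- iterable of (n_active_x, n_active_y, division_time) tuples
    key     -- function mapping a record to its group label
    """
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec[2])
    stats = {}
    for label, times in groups.items():
        avg = mean(times)
        stats[label] = (avg, pvariance(times) / avg)  # variance rescaled by the average
    return stats

# Made-up records, for illustration only.
records = [(5, 2, 10.0), (15, 2, 14.0), (5, 3, 6.0), (15, 3, 6.0)]
by_active_y = division_time_stats(records, key=lambda r: r[1])        # each value separately
by_active_x = division_time_stats(records, key=lambda r: r[0] // 10)  # bins of size 10
print(by_active_y)
print(by_active_x)
```

Grouping by the number of active Y molecules uses every value separately, while the number of active X molecules is coarse-grained into bins of size 10, following the binning described for the histograms.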
4.2 Preservation of the minority molecule

As another demonstration of control, we study a model in which there is 
more specific catalysis of molecule synthesis. Here, instead of single active 
molecule types for X and Y, we consider a system with k types of active
X and Y molecules, $X^A(i)$ and $Y^A(i)$ ($i = 1, 2, \ldots, k$). In this model, each
active molecule type catalyzes the synthesis of only a few types (m < k) of
the other species of molecules. Graphically representing the ability for such
catalysis using arrows as $i_x \to j_y$ for X → Y and $i_y \to j_x$ for Y → X, the
network of arrows defining the catalyzing relations for the entire system is 
chosen randomly, and is fixed throughout each simulation. An example of 
such a network (that which was used in the simulation discussed below) is 
shown in Fig. 7a. Here we assume that both X and Y molecules have the 
same "specificity" (i.e., the same value of m) and study how this symmetry 
is broken. 
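
A random catalytic network of this kind can be generated as in the following Python sketch. It assumes that every active type catalyzes exactly m distinct types of the other species, chosen uniformly at random; the text only requires "a few types (m < k)", so this is one possible reading, and the function name and the values of k and m in the usage lines are hypothetical.

```python
import random

def random_catalytic_network(k, m, seed=None):
    """Return (x_to_y, y_to_x): for each active type, the m types it catalyzes.

    x_to_y[i] lists the Y types whose synthesis is catalyzed by active X type i;
    y_to_x[i] lists the X types whose synthesis is catalyzed by active Y type i.
    """
    if not 0 < m < k:
        raise ValueError("need 0 < m < k")
    rng = random.Random(seed)
    x_to_y = {i: sorted(rng.sample(range(k), m)) for i in range(k)}
    y_to_x = {i: sorted(rng.sample(range(k), m)) for i in range(k)}
    return x_to_y, y_to_x

# The network is drawn once and then kept fixed throughout a simulation.
x_to_y, y_to_x = random_catalytic_network(k=8, m=2, seed=1)
print(x_to_y)
print(y_to_x)
```

Both directions use the same m, which reflects the initial symmetry in specificity between the two species.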

As discussed in §3, when N, $\gamma_y/\gamma_x$, and F satisfy the conditions necessary
for realization of a state in which $N_y^I$ is sufficiently small, the surviving cell
type contains only a few active Y molecules, while the number of inactive
ones vanishes or is very small. Our simulations show that in the present
model with several active molecule types, only a single type of active Y
molecule remains after a sufficiently long time. We call this the "remaining type",
$i_r$ ($1 \le i_r \le k$). By contrast, at least m types of $X^A$ species that can be
catalyzed by the remaining $Y^A$ molecule species remain. Accordingly, for a
cell that survived after a sufficiently long time, a single type of $Y^A$ molecule
catalyzes the synthesis of (at least) m kinds of X molecule species, while
the multiple types of X molecules catalyze this single type of $Y^A$ molecules.
Thus, the original symmetry regarding the catalytic specificity is broken as
a result of the difference between the synthesis speeds.

Due to autocatalytic reactions, there is a tendency for further increase of 
the molecules that are in the majority. This leads to competition for repli- 
cation between molecule types of the same species. Since the total number 
of Y molecules is small, this competition leads to all-or-none behavior for 
the survival of molecules. As a result, only a single type of species Y re- 
mains, while for species X, the numbers of molecules of different types are 
statistically distributed as guaranteed by the uniform replication error rate. 

The distribution of $X^A(i)$ species and the growth speed depend on the
identity of the remaining type $i_r$ of $Y^A(i)$. In Fig. 7b, we display long-time
number distributions of molecules reached from 6 different initial configurations,
with a gray scale plot. The population distribution of $Y^A(i)$
molecules satisfies $N(Y^A(i_r)) \approx 2$–$6$, and $N(Y^A(j)) \approx 0$ for $j \ne i_r$. The
identity of the remaining type $i_r$ depends on the initial conditions. The number
distribution of the X molecule types depends strongly on $i_r$, as shown in Fig.
7b. This strong dependence is expected, since the m types of X molecules 
catalyzed by each active type of Y molecule differ, as determined by the 
catalytic network (Fig. 7a). 

Although X and Y molecules catalyze each other, a change in the type 
of the remaining active Y molecule has a much stronger influence on X 
than a change in the types of the active X molecules on Y, since the number
of Y molecules is much smaller. Consider, for example, a structural
change of an active Y molecule from type $i_r$ to $i_r'$ (for example, a change
in polymer sequence) that may occur during synthesis. If such a change occurs
and remains, there will be a composition change from $N(Y(i_r)) \neq 0$ to
$N(Y(i_r')) \neq 0$. This change will alter the distribution of X(i) drastically,
as suggested by Fig. 7b. By contrast, a structural change experienced by X
molecules will have a much smaller influence on the distribution of Y(j).
(This ignores the case in which many X molecules change to the same type
simultaneously by replication error, resulting in a drastic change of the distribution
of X. Such a situation, however, is very rare in accordance with
the law of large numbers.) In fact, there always remain some fluctuations in
the distribution of X molecules, while the distribution of Y molecules (i.e., the
identity of the remaining type $i_r$) is fixed over many generations, until a
rare structural change leads to a different remaining type, which may allow 
for a higher growth speed and the survival of the type containing it under 
selection. 

With the results in §3 and §4, we can conclude that the Y molecules,
i.e., the minority species, control the behavior of the system, and are preserved
well over many generations. We therefore call this state the minority-controlled
(MC) state.

4.3 Evolvability of the minority controlled state 

An important characteristic of the MC state is evolvability. Consider a vari- 
ety of active molecules, with different catalytic activities. Then the synthesis 
rates $\gamma_x$ and $\gamma_y$ depend on the activities of the catalyzing molecules. Thus,
$\gamma_x$ can be written in terms of the molecule's inherent growth rate, $g_x$, and
the activity, $e_y(i)$, of the corresponding catalyzing molecule Y(i):

$\gamma_x = g_x \times e_y(i)$, and similarly $\gamma_y = g_y \times e_x(i)$.

Since such a biochemical reaction is entirely facilitated by catalytic activity, 
a change of $e_y$ or $e_x$, for example by the structural change of polymers, will
be more important. Given the occurrence of such a change to molecules,
those with greater catalytic activities will be selected through competitive
evolution, leading to the selection of larger $e_y$ and $e_x$. As an example to
demonstrate this point, we have extended the model in §2 to include k kinds 
of active molecules with different catalytic activities. Then, molecules with 
greater catalytic activities are selected through competition. 

Here, the minority controlled state is relevant to realize evolvability. Since 
only a few molecules of the Y species exist in the MC state, a structural 
change to them strongly influences the catalytic activity of the protocell. 
On the other hand, a change to X molecules has a weaker influence, on the 
average, since the deviation of the average catalytic activity caused by such 
a change is smaller, as can be deduced from the law of large numbers. Hence 
the MC state is important for a protocell to realize evolvability. 

5 Effect of Higher-order Catalysis 

In the first toy model considered in this paper, in order to realize the MC 
state, the difference between the time scales of the two kinds of molecules 
often must be rather large. For example, the ratio $\gamma_y/\gamma_x$ should typically be
less than 0.05 when the number of molecules is in the range 500–2000. (If
the number is larger, the ratio should be much smaller.) This difference in
growth rates required to realize the MC state is drastically reduced in a model 
that includes higher-order catalytic reaction processes in the replication of 
molecules. 

Consider a replication of molecules described by the following:

$X + X' + Y \to 2X + X' + Y$; $Y + Y' + X \to 2Y + Y' + X$. (2)

In complex biochemical reaction networks, such higher-order catalytic re- 
actions often exist. Indeed, proteins in a cell are catalyzed not solely by 
nucleotides but with collaboration of proteins and nucleotides. Nucleotides, 
similarly, are catalyzed not solely by proteins but with collaboration of
nucleotides and proteins.

In the continuum limit, the rate equation corresponding to the reaction
(2) is given by $dN_x^J/dt = \gamma_x N_x^J N_x^A N_y^A$ and $dN_y^J/dt = \gamma_y N_y^J N_y^A N_x^A$. In this
higher-order catalytic reaction, it is expected that the difference between the
numbers of X and Y molecules is amplified.

Consider the equations $dx/dt = \gamma_x x^2 y$ and $dy/dt = \gamma_y y^2 x$. Then the
relation

$x^{1/\gamma_x} / y^{1/\gamma_y} = \mathrm{const.} = x_0^{1/\gamma_x} / y_0^{1/\gamma_y}$ (3)

holds, with the initial values $x_0$ and $y_0$ (where $x_0 + y_0 = C$). Then, consider
the following division process: if $x(t) + y(t) = 2C$, then $x \to x/2$ and $y \to y/2$.
With the continued temporal evolution satisfying Eq. (3) and this division
process, y approaches 0 if $\gamma_y < \gamma_x$. (Recall that the curve y(x) given by
Eq. (3) is concave.)
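
The decay of the minority fraction under repeated growth and division can be checked numerically with the Python sketch below. It uses the fact that dividing the two growth equations gives dy/dx = (gamma_y/gamma_x) y/x, so that during growth y = y0 (x/x0)^a with a = gamma_y/gamma_x. The function names, the symmetric starting point, and the bisection tolerance are assumptions made for illustration.

```python
def grow_and_divide(x0, y0, a, C=1.0, tol=1e-12):
    """One cycle: grow along y = y0 * (x / x0)**a until x + y = 2C, then halve.

    a is the ratio gamma_y / gamma_x; the growth endpoint is found by bisection,
    which works because x + y0 * (x / x0)**a increases monotonically with x.
    """
    lo, hi = x0, 2.0 * C
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid + y0 * (mid / x0) ** a < 2.0 * C:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    y = y0 * (x / x0) ** a
    return x / 2.0, y / 2.0

def minority_fraction_history(a, cycles, C=1.0):
    """Fraction y / (x + y) after each division, starting from x = y = C / 2."""
    x, y = C / 2.0, C / 2.0
    history = [y / (x + y)]
    for _ in range(cycles):
        x, y = grow_and_divide(x, y, a, C)
        history.append(y / (x + y))
    return history

history = minority_fraction_history(a=0.9, cycles=200)
print(history[0], history[10], history[-1])
```

Even with a synthesis rate only 10% smaller (a = 0.9), the fraction of y shrinks at every cycle, because each growth phase multiplies the ratio y/x by (x/x0)^(a-1), which is below one whenever a < 1.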

Now, instead of the differential equations, we again carried out stochastic 
simulations corresponding to the above reaction process (2), using the same 
procedures (iii)-(vii) described in §2. 

The average values of active and inactive molecules obtained in these
simulations are plotted in Fig. 8 as functions of $\gamma_y$. We see that if $\gamma_y/\gamma_x < 0.93$,
the MC state is reached. Note that here a difference in growth speeds as small
as about 10% is sufficient to realize the MC state.

In the present case, if $\gamma_y/\gamma_x$ is less than ~0.7, the system comes to exist in
the state with $N_y^A = 1$, $N_y^I = 0$. Since $N_y^A = 1$, X molecules are synthesized,
while Y molecules are not. Accordingly, after a division, only the cell that
inherits the Y molecule keeps growing. For this reason, the growth of the
number of such protocells is not possible, and hence a state with such a small
value of $\gamma_y/\gamma_x$ is not expected to be reached through evolution.

The main conclusion of this section is that, when we consider higher-order
catalysis, the realization of the minority controlled state occurs for a
wider range of values of $\gamma_y/\gamma_x$. In the above example, a minority controlled
state maintaining growth is realized for $0.7 < \gamma_y/\gamma_x < 0.93$, while the former
inequality is always satisfied as long as one considers a cell that continues to
produce offspring.
6 Discussion



In this paper, we have shown that in a mutually catalyzing system, molecules
Y with the slower synthesis speed tend to act as the information carrier.
Through selection under reproduction, a state in which there is a very
small number of inactive Y molecules is selected. This state is termed the
"minority controlled state". Between the two molecule species, there appears
a separation of roles: one has a larger number, and the other has a greater
catalytic activity. The former provides a variety of chemicals and reaction
paths, while the latter holds "information", in the sense of the two properties
mentioned in the Introduction, 'preservation' and 'control'. We now discuss
these properties in more detail.

[Preservation property]: A state that can be reached only through very
rare fluctuations is selected, and it is preserved over many generations.^ In
the theory presented here, the selected and preserved state is one with $N_y^A \approx 2$–$10$
and $N_y^I \approx 0$. The realization of such a state is very rare when we
consider the rate equation obtained in the continuum limit. For a model 
with several types of both molecule species, the type of active Y molecules 
with nonzero population remains fixed, in spite of the process of stochastic 
fluctuations. 

[Control property]: A change in the number of Y molecules has a stronger
influence on the growth rate of a cell than a change in the number of X
molecules. Also, a change in the catalytic activity of the Y molecules has
a strong influence on the growth of the cell. The catalytic activity of the
Y molecules acts as a control parameter of the system. For a model with
several types of each molecule species, X (the majority species) has a
smaller catalytic activity on the average, and its catalysis is rather
specific, only acting in the synthesis of a single or a few types of
molecules. The minority species Y has a greater catalytic ability and acts
to catalyze the synthesis of many kinds of molecules. Hence a change in Y
has a very strong influence.

With the information carrier defined in terms of the preservation of rare
states and the control of the behavior of the system, we have shown that
the molecule species with the slower synthesis speed acts as the
information carrier. In this way, the generation of information is
understood from a kinetic viewpoint.

Recall also Shannon's definition of information [Shannon and Weaver 1949],
according to which rarer events carry a greater amount of information.

According to our results, the separation of the roles of metabolism ("the
chicken") and information ("the egg") is explained as a general consequence
of a cell system with mutual catalysis and an appropriate difference
between catalytic activities (leading to a difference in synthesis speeds).
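
Shannon's definition, recalled above, can be written out explicitly. The
expression below is the standard textbook form of the self-information of
an event, not a formula taken from this paper; the symbol $p$ and the two
numerical cases are assumptions chosen only for illustration.

```latex
% Self-information (in bits) of an event E that occurs with probability p
I(E) = -\log_2 p
% Illustrative cases (probabilities are assumed, not measured):
%   p = 1/2     gives  I = 1 bit
%   p = 1/1024  gives  I = 10 bits
```

In this sense, a state that is reached only through very rare fluctuations
carries more information than a typical state.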

Finally, the following question remains: How does the difference in the
catalytic activity necessary to realize the MC state generally come to exist? 
Of course, it is quite natural in a complex chemical system that there will 
be differences in synthesis speeds or catalytic activities, and, in fact, this 
is the case in the biochemistry of present-day organisms. Still, it would be 
preferable to have a theory describing the spontaneous divergence of syn- 
thesis speeds without assuming a difference in advance, to provide a general 
model of the possible 'origin' of bio-information from any possible replication 
system. 

To close the paper, we discuss (1) the evolutionary stability and (2) evo- 
lutionary realizability of the MC state. 

One important consequence of the existence of the MC state is evolvability.
Mutations introduced to the majority species tend to be cancelled out on
the average, in accordance with the law of large numbers, whereas a change
in one of the few minority molecules is not averaged out and can be acted
on by selection. Hence, the catalytic activity of the minority species (Y
in our model) is not only sustained, but has a greater potential to
increase through evolution.
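
The averaging argument can be illustrated with a small numerical sketch.
This is not the model of this paper: it assumes, purely for illustration,
that each molecule receives an independent random change in activity drawn
uniformly from [-1, 1], and it compares how much the cell-wide average
change fluctuates for a small and a large population. The function names,
the uniform distribution, and the population sizes 5 and 1000 are all
hypothetical choices.

```python
import random
import statistics


def mean_activity_shift(n_molecules, rng):
    """Average of independent per-molecule activity changes (assumed uniform)."""
    total = sum(rng.uniform(-1.0, 1.0) for _ in range(n_molecules))
    return total / n_molecules


def spread_of_mean_shift(n_molecules, trials=500, seed=0):
    """Standard deviation of the cell-wide mean shift over many trials."""
    rng = random.Random(seed)
    shifts = [mean_activity_shift(n_molecules, rng) for _ in range(trials)]
    return statistics.pstdev(shifts)


# Hypothetical population sizes: a minority-like and a majority-like species.
minority_spread = spread_of_mean_shift(5)
majority_spread = spread_of_mean_shift(1000)
print(f"spread with 5 molecules:    {minority_spread:.4f}")
print(f"spread with 1000 molecules: {majority_spread:.4f}")
```

Under these assumptions the spread shrinks roughly as the inverse square
root of the number of molecules, so random changes in a large population
largely average out, while changes in a population of a few molecules do
not.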

Recently, there have been some experiments to construct minimal repli- 
cating systems in vitro. In particular, Matsuura et al. (2001) constructed a 
replication system of molecules including DNA polymerase, synthesized by 
the corresponding gene. Roughly speaking, the polymerase in the experiment 
corresponds to X in our model, while the polymerase gene corresponds to 
Y. In that experiment, instead of changing the synthesis speed $\gamma_y$ or
$N$, the influence of the number of genes was investigated.

In the experiment, it was found that replication is maintained even un- 
der deleterious mutations (that correspond to structural changes from ac- 
tive to inactive molecules in our model), only when the population of DNA 
polymerase genes is small and competition of replicating systems is applied. 
When the number of genes (corresponding to Y) is small, the information 
contained in the DNA polymerase genes is preserved. This is made possible
by the maintenance of rare fluctuations, as found in our study. 

As discussed in §4, a change in catalytic activity can be included in the
model by considering a system with several kinds of active molecules with
different activities. By considering a mutation from $X^A(i)$ to $X^A(j)$
(or $Y^A(i)$ to $Y^A(j)$), accompanied by a change in the value of $c_x$
(or $c_y$), one can examine the stability of the MC state with respect to
mutation. If the initial difference between the catalytic abilities $c_x$
and $c_y$ (and other parameters) satisfies the conditions stated in §3, the
MC state is realized. We then examined whether such a state is destroyed by
a change in the catalytic activities of molecules. We found that this
difference is in fact maintained over many generations and that the MC
state continues to exist. This behavior is due to the fact that a small
mutation of Y strongly influences the synthesis of X, and a mutation
resulting in a decrease of $c_y$ is not selected. Hence the MC state
possesses evolutionary stability.

The final remaining question we wish to address regards the realizability 
of the MC state in the situation that initially the two molecule species have 
almost the same catalytic activity. One may expect that there would occur 
a divergence of the catalytic activities of two such molecule species, because 
once one species (say Y) has a larger catalytic activity, the number of X 
molecules will increase. This results in Y becoming the minority species, 
which implies that its influence on the behavior of the cell will become
stronger. For this reason, the catalytic activity of Y increases faster than 
that of X, and thus the replication speed of X becomes larger. In this way, 
the difference between replication speeds of X and Y might become further 
amplified. 

While the above argument seems reasonable, it does not hold for our model.
In simulations including such a structural change, we have not observed
such spontaneous symmetry breaking with regard to the growth speeds of the
two species when these species initially have (almost) equal catalytic
activities. The reason is as follows. In our model, the division of a cell
is assumed to occur when the total number of molecules becomes double the
original number. Now, in the model, the collision of two molecules is
assumed to occur randomly. Hence the probability for a collision leading to
the synthesis of molecules should be proportional to
$\gamma_x N_x N_y + \gamma_y N_x N_y$, if we assume a constant
proportionality between the numbers of active and inactive molecules. Then,
note that the quantity $N_x N_y \propto N_x (N - N_x)$ has a peak at
$N_x = N_y$. It follows that the growth speed should be maximal when the
numbers of the two species are equal. Hence there is a tendency toward a
state in which there are equal numbers of both species. Of course, this
argument is rather rough, due to the assumption concerning the ratio of
active to inactive molecule numbers, and the position of the peak may
deviate slightly from $N_x = N_y$. However, the basic idea here is correct,
and there is undoubtedly a tendency toward equal numbers. For this reason,
a state with a large difference is not reached spontaneously through some
kind of symmetry breaking.
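
The location of the peak can be checked numerically. The sketch below is
only an illustration of the rough argument above, not the simulation used
in this paper: it treats every molecule as active (the constant
active-to-inactive ratio is absorbed into the rate constants), fixes the
total number of molecules, and scans all possible numbers of X molecules.
The function names and the particular parameter values are hypothetical.

```python
def synthesis_rate(n_x, n_total, gamma_x=1.0, gamma_y=1.0):
    """Rate proportional to gamma_x*N_x*N_y + gamma_y*N_x*N_y at fixed total."""
    n_y = n_total - n_x
    return gamma_x * n_x * n_y + gamma_y * n_x * n_y


def best_n_x(n_total, gamma_x=1.0, gamma_y=1.0):
    """Number of X molecules that maximizes the synthesis rate."""
    candidates = range(n_total + 1)
    return max(
        candidates,
        key=lambda n_x: synthesis_rate(n_x, n_total, gamma_x, gamma_y),
    )


# Assumed values for illustration: equal and strongly unequal synthesis speeds.
peak_equal = best_n_x(1000, gamma_x=1.0, gamma_y=1.0)
peak_unequal = best_n_x(1000, gamma_x=1.0, gamma_y=0.01)
print(peak_equal, peak_unequal)
```

Because both terms share the factor $N_x N_y$, the maximum sits at equal
numbers of the two species whatever the values of the synthesis speeds, at
least within this simplified picture.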

There are some possible scenarios within which the above described ten- 
dency toward equal growth speeds may be ineffective. 

1. Higher-order catalysis: As mentioned in §5, the imbalance necessary to
realize an MC state is much smaller when higher-order catalysis is
considered. Indeed, by introducing the mutation of catalytic activity to
the model studied in §5, we have sometimes observed spontaneous symmetry
breaking between the parameters characterizing the two species. The
resulting state with a sufficient difference between the growth speeds of
the two species, however, does not last very long, since the necessary
imbalance between $\gamma_x$ and $\gamma_y$ is so small that mutation can
reverse the relative sizes of $\gamma_x$ and $\gamma_y$.

2. Change in the collision condition: In our model, collisions of molecules
occur randomly. Hence if the number of X molecules is larger, most col- 
lisions occur between two X molecules, and no reaction occurs. However, 
if molecules are arranged spatially under different conditions (e.g., consider 
the case in which X molecules are on a membrane and Y molecules are in 
a contained medium), then the number of reaction events between X and Y 
molecules can be increased. If we include this type of physical arrangement, 
which is rather natural when considering a cell, the tendency toward equal 
numbers no longer exists, and the divergence of growth speeds in molecules 
should occur. 

3. Condition for growth: We have assumed that the division of the
protocell occurs when the total number of molecules doubles. This
assumption is useful as a minimal abstract model, but it may be more
natural to have a threshold that depends on the number of molecules of one
species (or, more generally, of some subset of all species), rather than
the total number. For example, consider the case that division occurs when
the size of a membrane synthesized by biochemical reactions is larger than
some threshold. This condition could be modeled by stipulating that
division occurs when the number of molecules of one species, say X, that
composes the membrane is larger than some threshold value. By imposing this
type of division condition, the tendency toward equal numbers of X and Y
molecules could be avoided, and the divergence of the replication speeds of
X and Y could take place.

Note that, since scenarios 2 and 3 assume another kind of symmetry breaking
between X and Y (albeit one different from that of the synthesis speeds),
they cannot provide a final solution to the problem of true spontaneous
symmetry breaking, although the assumptions may be biologically reasonable.

4. Network structure: The catalytic network in a cell is generally quite
complex, with many molecules participating in mutual catalysis for
replication. The evolution of replication systems with such catalytic
networks has been studied since the proposal of the hypercycle by Eigen and
Schuster [1979]. Dyson [1985], on the other hand, obtained a condition for
loose reproduction of protocells with complex reaction networks consisting
of active and inactive molecules. The origin of recursive replication from
such loose reproduction has also been discussed [Segre, Ben-Eli, and Lancet
2000].

The differentiation of cells with a catalytic reaction network has also been 
studied [Kaneko and Yomo 1997, 1999; Furusawa and Kaneko 1998]. Here it
has been found that chemicals with low concentrations are often important in 
differentiation. If the total number of molecules participating in the reaction 
network is small, there should generally exist some species whose numbers 
of molecules are small, and the discrete nature of these numbers plays a 
significant role. For example, Togashi and Kaneko [2001] have found novel 
symmetry breaking that appears in a catalytic reaction system with a small 
total number of molecules. Furthermore, a preliminary study of the reaction 
network version of the model considered in this paper reveals spontaneous 
symmetry breaking that distinguishes a few controlling molecule species from 
a large number of non-controlling species, without assuming a difference in 
synthesis speeds. 

The symmetry breaking by the network structure is related to the evolution
of specificity. Although we have studied catalysis that has no specificity
(except for the model considered in §4.2), in reality one type of molecule
can catalyze the synthesis of only a limited number of molecule species.
Interestingly, a preliminary study shows a symmetry breaking with regard to
the roles of molecules (with equal synthesis speeds) when higher-order
catalytic reactions (as in §5) in random networks with catalytic
specificity are included.
Figure 1: Schematic representation of our model.
Figure 2: Dependence of $\langle N_x^A \rangle$ (x), $\langle N_x^I \rangle$
(+), $\langle N_y^A \rangle$ (□), and $\langle N_y^I \rangle$ (*) on $N$.
The parameters were fixed as $\gamma_x = 1$, $\gamma_y = 0.01$, and
$\mu = 0.05$. Plotted are the averages of $N_x^A$, $N_x^I$, $N_y^A$, and
$N_y^I$ at the division event, and thus their sum is $2N$. In all the
simulations conducted for the present paper, we used $M_{tot} = 100$, and
the sampling for the averages was taken over 10^ to 3 x 10^ steps, where
the number of divisions ranges from 10^ to 10^, depending on the
parameters.
Figure 3: Dependence of $\langle N_x^A \rangle$ (x), $\langle N_x^I \rangle$
(+), $\langle N_y^A \rangle$ (□), and $\langle N_y^I \rangle$ (*) on $F$.
The parameters were fixed as $\gamma_x = 1$, $\gamma_y = 0.01$,
$\mu = 0.05$, and $N = 1000$. Plotted are the averages of $N_x^A$, $N_x^I$,
$N_y^A$, and $N_y^I$ at the division event, and thus their sum is
$2N = 2000$.
Figure 4: Dependence of $\langle N_x^A \rangle$ (x), $\langle N_x^I \rangle$
(+), $\langle N_y^A \rangle$ (□), and $\langle N_y^I \rangle$ (*) on
$\gamma_y$. The parameters were fixed as $\gamma_x = 1$, $\mu = 0.05$,
$F = 128$, and $N = 1000$. Plotted are the averages of $N_x^A$, $N_x^I$,
$N_y^A$, and $N_y^I$ at the division event.
Figure 5: Dependence of the active-to-inactive ratio on F. The parameters
were fixed as $\gamma_x = 1$, $\gamma_y = 0.01$, $\mu = 0.05$, and
$F = 128$. Plots for the values 0.005 (O), 0.01 (+), 0.015 (□), 0.02 (x),
0.025 (△), and 0.03 (*) are overlaid. Plotted are the averages of $N_x^A$,
$N_x^I$, $N_y^A$, and $N_y^I$ at the division event.
Figure 6: (a) Histograms of the number of division events with given
(+) and with given Ny (x), plotted versus ^^^^^ and - 2, respectively.
The histogram representing the averages for was computed with a bin
size of 10 as discussed in the text. These histograms were found with a
sampling of divisions occurring between the 6 x 10^ and 10^ time steps. (b)
Average number of time steps required for the division of protocells for
given Ny (x) and (+). These plots were obtained using the histograms in
Fig. 6(a).
Figure 7: (a) Catalytic network between X(i) and Y(i). The arrows from
the top column point to the types of the Y species that each X(i)
catalyzes, while those from the middle point to the types of the X species
that each Y(i) catalyzes. Each of the $k = 10$ active species catalyzes the
synthesis of 4 chemicals. (b) The logarithm of the average population of
X(i) displayed with a gray scale. From top to bottom, 6 samples resulting
from different initial conditions are plotted. The type $i_r$ of Y(i) with
non-vanishing population Y(i) corresponding to each column is as follows
(top to bottom): $i_r$ = 10, 8, 4, 5, 4, and 2. The parameters were chosen
as $F = 16 \times k = 160$, $\mu = 0.03$, $\gamma_x = 1$,
$\gamma_y = 0.01$, $N = 1000$.
Figure 8: Dependence of $\langle N_x^A \rangle$ (x), $\langle N_x^I \rangle$
(+), $\langle N_y^A \rangle$ (□), and $\langle N_y^I \rangle$ (*) on
$\gamma_y$. The parameters were fixed as $\gamma_x = 1$, $\mu = 0.05$,
$F = 128$, and $N = 500$. Plotted are the averages of $N_x^A$, $N_x^I$,
$N_y^A$, and $N_y^I$ at the division event.